Improving the Estimation of Word Importance for News Multi-Document Summarization
Authors
Kai Hong and Ani Nenkova, University of Pennsylvania, Philadelphia, PA 19104
Abstract
In this paper, we propose a supervised model for ranking word importance that incorporates a rich set of features. Our model outperforms prior approaches at identifying words used in human summaries. Moreover, we show that an extractive summarizer that includes our estimation of word importance produces summaries comparable to the state of the art under automatic evaluation.
Disciplines: Computer Engineering | Computer Sciences
Comments: University of Pennsylvania Department of Computer and Information Science Technical Report No. MSCIS-14-02. This technical report is available at ScholarlyCommons: http://repository.upenn.edu/cis_reports/989
Similar resources
Improving the Estimation of Word Importance for News Multi-Document Summarization - Extended Technical Report
In this paper, we propose a supervised model for ranking word importance that incorporates a rich set of features. Our model is superior to prior approaches for identifying words used in human summaries. Moreover we show that an extractive summarizer which includes our estimation of word importance results in summaries comparable with the state-of-the-art by automatic evaluation. Disciplines Co...
Extraction-Based Text Summarization Using Fuzzy Analysis
Due to the explosive growth of the world-wide web, automatic text summarization has become an essential tool for web users. In this paper we present a novel approach for creating text summaries. Using fuzzy logic and WordNet, our model extracts the most relevant sentences from an original document. The approach utilizes fuzzy measures and inference on the extracted textual information from the docu...
A New Document Embedding Method for News Classification
Text classification is one of the main tasks of natural language processing (NLP). In this task, documents are classified into pre-defined categories. A great deal of news spreads on the web. A text classifier can categorize news automatically, which facilitates and accelerates access to the news. The first step in text classification is to represent documents in a suitable way t...
Use of Multiple Features for Extracting Topics from News Clusters
In this paper we consider a method for extraction of sets of semantically similar language expressions representing different participants of the text story – thematic nodes. The method is based on the structural organization of news clusters and exploits comparison of various contexts of words. The word contexts are used as a basis for multiword expression extraction and thematic node construc...
Improving the Performance of the Random Walk Model for Answering Complex Questions
We consider the problem of answering complex questions that require inferencing and synthesizing information from multiple documents and can be seen as a kind of topic-oriented, informative multi-document summarization. The stochastic, graph-based method for computing the relative importance of textual units (i.e. sentences) is very successful in generic summarization. In this method, a sentence...
Publication date: 2014